Ph.D. Dissertation: Reliable Extraction of Realistic 3D Facial Animation Parameters from Mirror-Reflected Multi-View Video Clips
Abstract
With the rapid development of facial animation and facial motion analysis, the need for motion capture techniques has increased dramatically. However, existing motion capture devices are still very expensive and have specific limitations. This dissertation proposes an accurate and inexpensive procedure for estimating 3D facial and lip motion trajectories from mirror-reflected multi-view video. Two plane mirrors are placed near the subject's cheeks, and a single digital video camcorder captures front and side views of the markers on the face simultaneously, without any special synchronization mechanism. A novel closed-form linear algorithm reconstructs 3D positions from correspondences between real and mirrored points, and the extrinsic environment parameters do not need to be calibrated in advance. Because the approach exploits the symmetry properties of mirrored objects, our computer simulations and expected-error estimates show that the proposed 3D position estimation is more robust against noise, more accurate, and simpler than general-purpose stereo vision approaches based on a linear algorithm or maximum likelihood optimization. In our experiments, a root mean square (RMS) error of less than 2 mm in 3D space is reached using only 20 arbitrary pairs of corresponding points to estimate the orientations and locations of the mirror planes. For 3D facial motion extraction, the proposed procedure tracks markers semi-automatically under normal lighting conditions. Adaptive Kalman predictors and filters are employed to improve tracking stability and to estimate the positions of occluded markers. Tracking can be made fully automatic by using fluorescent markers illuminated by ultraviolet (UV) "blacklight blue" (BLB) lamps.
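The symmetry property that the abstract relies on can be illustrated with a minimal sketch: a point and its mirror image are related by a reflection across the mirror plane, so the line joining them is parallel to the plane normal and their midpoint lies on the plane. This is not the dissertation's closed-form algorithm (which works from 2D image correspondences); the function names `reflect` and `plane_from_pair` and the plane parameterization `normal . x = offset` are assumptions made for illustration only.

```python
import math


def reflect(point, normal, offset):
    """Reflect a 3D point across the plane {x : normal . x = offset}.

    Hypothetical helper for illustration; `normal` need not be unit length.
    """
    length = math.sqrt(sum(c * c for c in normal))
    n = tuple(c / length for c in normal)          # unit normal
    d = offset / length                            # offset rescaled to match
    dist = sum(p * c for p, c in zip(point, n)) - d  # signed distance to plane
    return tuple(p - 2.0 * dist * c for p, c in zip(point, n))


def plane_from_pair(real, mirrored):
    """Recover the mirror plane from one real/mirrored 3D point pair.

    Assumes the point does not lie on the mirror (real != mirrored).
    Returns (unit_normal, offset); the normal points from the mirrored
    point toward the real point.
    """
    diff = tuple(r - m for r, m in zip(real, mirrored))
    length = math.sqrt(sum(c * c for c in diff))
    n = tuple(c / length for c in diff)
    midpoint = tuple((r + m) / 2.0 for r, m in zip(real, mirrored))
    offset = sum(a * b for a, b in zip(n, midpoint))  # midpoint lies on plane
    return n, offset


if __name__ == "__main__":
    marker = (1.0, 2.0, 3.0)
    image = reflect(marker, (1.0, 0.0, 0.0), 0.0)
    print(image)
    print(plane_from_pair(marker, image))
```

In practice the 3D positions are not known in advance, which is why the dissertation estimates the mirror planes and the 3D points from image correspondences; the sketch only shows the geometric constraint that makes such an estimate possible.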
To handle missing markers, false marker detections, and false tracking, we use spatial coherence on the face surface and temporal coherence in motion to detect, rectify, and compensate for false tracking trajectories automatically. More than 300 markers on a subject's face and lips are tracked from 30 fps video clips. The system will be extended to real-time tracking from live video in the near future. The estimated 3D facial motion data have also been applied in practice to our facial animation system. In addition, a web-enabled talking head is proposed, in which facial animation is driven by natural speech. A speech analysis module extracts the phoneme sequence from the input speech, and the phonemes are then converted to the MPEG-4 high-level facial animation parameters called visemes, which drive a 3D head model to perform the corresponding facial expressions. The talking head has been developed as plug-ins for web browsers and requires only 6 Kbps to stream high-resolution animation over the Internet. Furthermore, my work was also used in a collaborative project between INRIA in France and National Taiwan University on a French-driven talking head system.
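The phoneme-to-viseme step described above can be sketched roughly as a table lookup. The mapping below is an illustrative subset that follows the commonly cited MPEG-4 viseme numbering for a few consonant groups; it should be treated as an assumption, not as the table used in the dissertation. The names `PHONEME_TO_VISEME` and `phonemes_to_visemes`, and the choice to merge consecutive identical visemes, are likewise hypothetical.

```python
# Illustrative subset only (assumption): phoneme symbol -> viseme index.
PHONEME_TO_VISEME = {
    "p": 1, "b": 1, "m": 1,   # bilabials share one mouth shape
    "f": 2, "v": 2,           # labiodentals
    "t": 4, "d": 4,
    "k": 5, "g": 5,
    "s": 7, "z": 7,
    "n": 8, "l": 8,
    "r": 9,
}
VISEME_NONE = 0  # neutral mouth shape for phonemes missing from the table


def phonemes_to_visemes(phonemes):
    """Convert a phoneme sequence to a viseme sequence.

    Consecutive identical visemes are merged, since they would drive the
    head model with the same mouth shape (a design choice of this sketch).
    """
    visemes = []
    for phoneme in phonemes:
        viseme = PHONEME_TO_VISEME.get(phoneme, VISEME_NONE)
        if not visemes or visemes[-1] != viseme:
            visemes.append(viseme)
    return visemes


if __name__ == "__main__":
    print(phonemes_to_visemes(["m", "b", "f", "s", "z", "r"]))
```

Sending a short sequence of small integers rather than mesh data is one plausible reason a stream of this kind can fit in a few kilobits per second, although the abstract does not spell out the encoding.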
Similar resources
Realistic 3D facial animation parameters from mirror-reflected multi-view video
In this paper, a robust, accurate and inexpensive approach to estimate 3D facial motion from multi-view video is proposed, where two mirrors located near one’s cheeks can reflect the side views of markers on one’s face. Nice properties of mirrored images are utilized to simplify the proposed tracking algorithm significantly, while a Kalman filter is employed to reduce the noise and to predict t...
Extracting 3D Facial Animation Parameters from Multiview Video Clips
synthetic face’s behaviors must precisely conform to those of a real one. However, facial surface points, being nonlinear and without rigid body properties, have quite complex action relations. During speaking and pronunciation, facial motion trajectories between articulations, called coarticulation effects, also prove nonlinear and depend on preceding and succeeding articulations. Performance-...
Extracting 3D facial animation parameters from multiview video clips - Computer Graphics and Applications, IEEE
synthetic face’s behaviors must precisely conform to those of a real one. However, facial surface points, being nonlinear and without rigid body properties, have quite complex action relations. During speaking and pronunciation, facial motion trajectories between articulations, called coarticulation effects, also prove nonlinear and depend on preceding and succeeding articulations. Performance-...
Extraction of 3D facial motion parameters from mirror-reflected multi-view video for audio-visual synthesis
The goal of our project is to collect a dataset of 3D facial motion parameters for talking-head synthesis. However, capturing human facial motion is usually expensive in related research, since special devices such as optical or electronic trackers are required. In this paper, we propose a robust, accurate and inexpensive approach to estimate human facial motion...
MoCap: Automatic and efficient capture of dense 3D facial motion parameters from video
M. Ouhyoung, Dept. of Computer Science and Information Engineering, National Taiwan University, No. 1 Roosevelt Rd. Sec. 4, Taipei, 106, Taiwan. Abstract: In this paper, we present an automatic and efficient approach to the capture of dense facial motion parameters, which extends our previous work of 3D reconstruction from mirror-reflected multiview video. To narrow sea...